feat: add workflow stage resume #747
MkDocs preview: https://7d708f5e.dd-docs-preview.pages.dev
Fern preview: https://nvidia-preview-pr-747.docs.buildwithfern.com/nemo/datadesigner
Greptile Summary
This PR implements stage-level resume for
| Filename | Overview |
|---|---|
| packages/data-designer/src/data_designer/interface/composite_workflow.py | Core resume implementation: adds prior metadata reading, fingerprint-based stage skipping, partial-stage delegation, downstream invalidation via force_rerun_downstream, atomic metadata writes, and path relativisation for portability. Logic traces correctly through all identified paths. |
| packages/data-designer/tests/interface/test_composite_workflow.py | Adds 11 new resume tests covering skip, rerun, partial/failed-stage delegation, callback-output invalidation, output-processor skip, moved-artifact portability, empty-stage propagation, corrupt-metadata fallback/strict, and ALWAYS-mode rejection. _mark_stage_resumable correctly strips completion-only fields to match real interrupted-run metadata shape. |
| docs/concepts/workflow-chaining.md | Adds resume section and removes "not implemented yet" bullet; docs match the implementation's IF_POSSIBLE/ALWAYS semantics. |
| fern/versions/latest/pages/concepts/workflow-chaining.mdx | Mirrors the MkDocs resume section for Fern; identical prose and code snippet, consistent with implementation. |
| plans/workflow-chaining/workflow-chaining.md | Updates plan status section to reflect completed stage-level resume slice; still-deferred items unchanged. |
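The overview for composite_workflow.py names two file-system techniques: atomic metadata writes and path relativisation for portability. The sketch below illustrates both in general terms. It is not the code from the PR: the function names `write_metadata_atomically` and `relativize`, and the file name `workflow_metadata.json` used in the usage note, are hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def write_metadata_atomically(metadata_path: Path, metadata: dict) -> None:
    """Write JSON to a temp file in the target directory, then rename it into place.

    Hypothetical helper. A reader of metadata_path sees either the old file or
    the complete new one, never a half-written file.
    """
    metadata_path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=metadata_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(metadata, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, metadata_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def relativize(artifact_path: Path, workflow_root: Path) -> str:
    """Return artifact_path relative to workflow_root as a POSIX-style string.

    Hypothetical helper. Storing relative paths lets the whole workflow
    directory be moved without invalidating the recorded artifact locations.
    Raises ValueError if artifact_path is not under workflow_root.
    """
    return artifact_path.resolve().relative_to(workflow_root.resolve()).as_posix()
```

For example, `write_metadata_atomically(root / "workflow_metadata.json", {"stages": []})` leaves no `.tmp` file behind on success, and `relativize(root / "stage_a" / "out.parquet", root)` yields `stage_a/out.parquet`.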
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[workflow.run resume=X] --> B[Read prior workflow metadata]
B --> C{resume=NEVER or\nno metadata?}
C -- yes --> D[prior_metadata = None]
C -- no --> E[prior_metadata loaded]
D & E --> F[For each stage]
F --> G{skipped_upstream_stage\nset?}
G -- yes --> H[Mark stage skipped_empty_upstream\ncontinue]
G -- no --> I[Compute stage_fingerprint]
I --> J{prior_matches?\nnot force_rerun_downstream\nAND fingerprint match}
J -- yes --> K{_can_skip_prior_stage?}
K -- yes --> L[Restore from prior metadata\nset previous_seed_path / fingerprint\nif completed_empty: set skipped_upstream_stage\ncontinue]
K -- no --> M{prior status in\nRESUMABLE_STAGE_STATUSES\nAND stage_path exists?}
M -- yes --> N[stage_resume = ALWAYS]
M -- no --> O{resume=ALWAYS\nAND NOT force_rerun_downstream?}
O -- yes --> P[raise DataDesignerWorkflowError]
O -- no --> Q[stage_resume = NEVER\ndelete stage_path if exists]
N --> R[Run stage via DataDesigner.create\nresume=stage_resume]
Q --> R
J -- no --> O
R --> S{stage has\noutput_processors?}
S -- yes --> T[Delete output-processor dir\nRun output-processor create fresh]
S -- no --> U[Determine output_seed_path\nvia on_success or output selection]
T --> U
U --> V{output_records == 0?}
V -- yes, allow_empty --> W[status=completed_empty\nset skipped_upstream_stage]
V -- no --> X[status=completed]
W & X --> Y[force_rerun_downstream = True\nwrite metadata\ncontinue to next stage]
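The per-stage branch of the flowchart (nodes J through Q) can be read as a small pure function. The sketch below is a reading of the diagram, not the PR's implementation: `decide_stage_action`, its parameters, the `WorkflowError` class, the enum values, and the members of `RESUMABLE_STAGE_STATUSES` are all assumptions made so the example is self-contained.

```python
from enum import Enum
from typing import Optional


class ResumeMode(Enum):
    """Stand-in for the library's ResumeMode; the string values are assumed."""
    NEVER = "never"
    IF_POSSIBLE = "if_possible"
    ALWAYS = "always"


# Assumed members; the flowchart only names the constant.
RESUMABLE_STAGE_STATUSES = {"partial", "failed"}


class WorkflowError(RuntimeError):
    """Stand-in for DataDesignerWorkflowError."""


def decide_stage_action(
    resume: ResumeMode,
    prior_status: Optional[str],
    fingerprint_matches: bool,
    can_skip_prior: bool,
    stage_path_exists: bool,
    force_rerun_downstream: bool,
) -> str:
    """Return "skip", "resume", or "fresh" for one stage, following nodes J to Q."""
    # Node J: a prior stage only counts if nothing upstream was rerun.
    prior_matches = fingerprint_matches and not force_rerun_downstream
    if prior_matches:
        # Node K: a completed prior stage can be restored from metadata.
        if can_skip_prior:
            return "skip"
        # Node M: an interrupted stage with artifacts on disk is resumed in place.
        if prior_status in RESUMABLE_STAGE_STATUSES and stage_path_exists:
            return "resume"
    # Node O: strict mode refuses to silently start over, unless the rerun
    # was forced by an upstream change.
    if resume is ResumeMode.ALWAYS and not force_rerun_downstream:
        raise WorkflowError("stage cannot be resumed under resume=ALWAYS")
    # Node Q: run from scratch (the caller deletes any stale stage directory).
    return "fresh"
```

Under this reading, `resume=ALWAYS` raises only when the stage itself is unusable; once an upstream stage has rerun, downstream stages start fresh in either mode.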
Reviews (4): Last reviewed commit: "Merge branch 'main' into andreatgretel/f..."
Review: PR #747 —
📋 Summary
Adds stage-level resume support for chained workflows so compatible completed stages can be reused, matching partial stages can continue through the existing single-stage resume path, and downstream stages rerun when upstream outputs change.
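To make "downstream stages rerun when upstream outputs change" concrete, here is one generic way to detect a reusable prefix of stages with chained fingerprints. This is an illustration only: the summary does not say what the PR's fingerprint covers, so the inputs hashed here (a stage config dict plus the upstream fingerprint) and the names `stage_fingerprint`, `chain_fingerprints`, and `reusable_prefix` are assumptions. The flowchart in the review suggests the PR additionally tracks a `force_rerun_downstream` flag rather than relying on chaining alone.

```python
import hashlib
import json
from typing import Optional


def stage_fingerprint(stage_config: dict, upstream_fingerprint: Optional[str]) -> str:
    """Hash a stage's config together with the fingerprint of the stage before it."""
    payload = json.dumps(
        {"config": stage_config, "upstream": upstream_fingerprint},
        sort_keys=True,  # stable key order so equal configs hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def chain_fingerprints(stage_configs: list) -> list:
    """Return one fingerprint per stage, each depending on all earlier stages."""
    fingerprints = []
    upstream = None
    for config in stage_configs:
        upstream = stage_fingerprint(config, upstream)
        fingerprints.append(upstream)
    return fingerprints


def reusable_prefix(prior: list, current: list) -> int:
    """Count the leading stages whose fingerprints are unchanged."""
    count = 0
    for old, new in zip(prior, current):
        if old != new:
            break
        count += 1
    return count
```

With three stages where only the second stage's config changes, the first stage keeps its fingerprint while the second and third both change, so `reusable_prefix` returns 1 and everything from the second stage onward reruns.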
🔗 Related Issue
N/A
🔄 Changes
- `CompositeWorkflow.run(resume=...)` with completed-stage reuse and partial-stage delegation.
- `ResumeMode.IF_POSSIBLE` falls back to fresh runs when prior metadata is unusable.
🧪 Testing
- `make test` passes
Ran:
- `.venv/bin/ruff format .`
- `.venv/bin/ruff check packages/data-designer/src/data_designer/interface/composite_workflow.py packages/data-designer/tests/interface/test_composite_workflow.py`
- `.venv/bin/ruff format --check packages/data-designer/src/data_designer/interface/composite_workflow.py packages/data-designer/tests/interface/test_composite_workflow.py`
- `.venv/bin/pytest packages/data-designer/tests/interface/test_composite_workflow.py -q` - 55 passed, 2 warnings
- `.venv/bin/pytest /home/ubuntu/Code/reviews/DataDesigner-747/smoke_test.py -q -s` - 2 passed against NVIDIA Build (nvidia/nemotron-3-nano-30b-a3b) and NVIDIA Inference (openai/openai/gpt-5.4-nano) using `/home/ubuntu/Code/.env`
Note: the full `.venv/bin/ruff check --fix .` currently hits an unrelated, pre-existing lint error (F404) in the generated notebook `docs/colab_notebooks/7-nemotron-personas.ipynb`.
✅ Checklist